lora rank
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
Wang, Ziyao
The rapid development of Large Language Models (LLMs) has been pivotal in advancing AI, with pre-trained LLMs being adaptable to diverse downstream tasks through fine-tuning. Federated learning (FL) further enhances fine-tuning in a privacy-aware manner by utilizing clients' local data through in-situ computation, eliminating the need for data movement. However, fine-tuning LLMs, given their massive scale of parameters, poses challenges for clients with constrained and heterogeneous resources in FL.
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania (0.04)
Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation
Li, Shiwei, Luo, Xiandi, Wang, Haozhao, Tang, Xing, Cui, Ziqiang, Liu, Dugang, Li, Yuhua, He, Xiuqiang, Li, Ruixuan
Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). LoRA essentially describes the projection of an input space into a low-dimensional output space, with the dimensionality determined by the LoRA rank. In standard LoRA, all input tokens share the same weights and undergo an identical input-output projection. This limits LoRA's ability to capture token-specific information due to the inherent semantic differences among tokens. To address this limitation, we propose Token-wise Projected Low-Rank Adaptation (TopLoRA), which dynamically adjusts LoRA weights according to the input token, thereby learning token-wise input-output projections in an end-to-end manner. Formally, the weights of TopLoRA can be expressed as $BΣ_X A$, where $A$ and $B$ are low-rank matrices (as in standard LoRA), and $Σ_X$ is a diagonal matrix generated from each input token $X$. Notably, TopLoRA does not increase the rank of LoRA weights but achieves more granular adaptation by learning token-wise LoRA weights (i.e., token-wise input-output projections). Extensive experiments across multiple models and datasets demonstrate that TopLoRA consistently outperforms LoRA and its variants. The code is available at https://github.com/Leopold1423/toplora-neurips25.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Hubei Province (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
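The token-wise projection described in the TopLoRA abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the correction applied to each token $x_t$ is $B\,\mathrm{diag}(\sigma_t)\,A\,x_t$, where the diagonal $\sigma_t$ is generated from the token itself. The linear generator `W_sigma` is an assumed stand-in for the paper's actual generator.

```python
import numpy as np

def toplora_delta(x, A, B, W_sigma):
    """Token-wise LoRA correction: B @ diag(sigma_t) @ A @ x_t per token.

    x: (seq, d_in) token embeddings; A: (r, d_in); B: (d_out, r);
    W_sigma: (r, d_in) illustrative generator for the diagonal Sigma_X.
    """
    low = x @ A.T               # (seq, r): A @ x_t for every token
    sigma = x @ W_sigma.T       # (seq, r): per-token diagonal entries
    return (low * sigma) @ B.T  # (seq, d_out)

rng = np.random.default_rng(0)
seq, d, r = 5, 16, 4
x = rng.normal(size=(seq, d))
A = rng.normal(size=(r, d))
B = rng.normal(size=(d, r))
W_sigma = rng.normal(size=(r, d))

delta = toplora_delta(x, A, B, W_sigma)
print(delta.shape)  # (5, 16)

# As the abstract notes, the effective per-token weight still has rank <= r.
W_t = B @ np.diag(x[0] @ W_sigma.T) @ A
assert np.linalg.matrix_rank(W_t) <= r
```

The point the assertion makes is the one stressed in the abstract: modulating the low-rank factors with a token-dependent diagonal does not increase the rank, it only makes the projection token-specific.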
TianHui: A Domain-Specific Large Language Model for Diverse Traditional Chinese Medicine Scenarios
Yin, Ji, He, Menglan, Zhang, Yujie, Zhang, Linshuai, Ma, Tingting, Tian, Ce, Wu, Jie, Xu, Lin, Jiang, Tao
Background: Currently, domain-specific large language models (LLMs) in traditional Chinese medicine (TCM) are primarily designed for clinical practice and medical education, yet they demonstrate substantial limitations when applied to research contexts owing to inadequate adaptability to complex tasks, thereby constraining their scientific utility. Moreover, the absence of comprehensive evaluation datasets and computational resource constraints hinder rigorous performance assessments and prevent extensive comparative or ablation experiments, ultimately resulting in suboptimal model performance and weakened persuasiveness. Objective: To address these challenges, this study proposed a method for constructing a specialized LLM for the TCM domain based on contextual data integration and domain knowledge fusion, and successfully developed a privatized LLM for the TCM profession, TianHui. Methods: First, we acquired a large amount of TCM data, including academic literature, published books, online public data, and other supplementary materials, and pre-processed them to generate a 0.97 GB unsupervised dataset and 611,312 QA pairs. We then adopted a phased training strategy (Pre-Training (PT) and Supervised Fine-Tuning (SFT)) and integrated three key technologies, Quantized Low-Rank Adaptation (QLoRA) parameter-efficient fine-tuning, DeepSpeed Stage 2 distributed training optimization, and Flash Attention 2 accelerated computation, to achieve optimal allocation of computational resources while guaranteeing training stability. Finally, we evaluated TianHui on 12 different types of benchmark test datasets and conducted extensive comparison and ablation experiments. Results: TianHui demonstrated excellent performance in 12 TCM-related application scenarios. It ranked in the top three on every evaluation index in six test datasets: APQ, TCMCD, HFR, HCCA, DHPE, and TLAW. Meanwhile, it achieved optimal performance on all indicators of the other six test datasets: TCMEE, APR, GCPMI, TCMKQA, TCMRC, and ADTG.
- Health & Medicine > Diagnostic Medicine (0.46)
- Health & Medicine > Health Care Technology > Medical Record (0.46)
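The three technologies named in the TianHui methods section are commonly combined through the Hugging Face stack. The following is a hedged configuration sketch under that assumption; the base checkpoint name, LoRA hyperparameters, and DeepSpeed config path are placeholders, not TianHui's actual settings.

```python
# Config sketch: QLoRA + DeepSpeed ZeRO Stage 2 + Flash Attention 2
# via transformers/peft. All names and values below are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(                      # QLoRA: 4-bit NF4 quantization
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "base-model",                              # placeholder base checkpoint
    quantization_config=bnb,
    attn_implementation="flash_attention_2",   # Flash Attention 2 kernels
)
model = get_peft_model(model, LoraConfig(      # trainable low-rank adapters
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
))
args = TrainingArguments(
    output_dir="out",
    deepspeed="ds_stage2.json",                # DeepSpeed ZeRO Stage 2 config file
)
```

This is only meant to show how the three pieces slot together; the actual PT/SFT pipeline, data loading, and hyperparameters are described in the paper, not here.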
Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning
Abdalla, M. H. I., Wang, Zhipin, Frey, Christian, Eger, Steffen, Grabocka, Josif
Large Language Model (LLM) conditioning refers to instructing an LLM to generate content in accordance with the norms and values of a specific culture, the beliefs of a particular political orientation, or any desired text-specified semantic conditioning. Unfortunately, prompt engineering does not ensure that LLMs behave in accordance with a desired conditioning due to the inductive bias of the pre-training and alignment datasets. Prior works have focused on fine-tuning LLMs by directly conditioning the LoRA weights; however, such methods introduce a large number of parameters. As a remedy, we propose Zhyper, a parameter-efficient factorized hypernetwork framework that generates context-aware LoRA adapters from textual descriptions. Experiments on multiple benchmarks show that Zhyper achieves competitive performance with up to 26x fewer parameters than the state-of-the-art baselines. Furthermore, we extend Zhyper to cultural alignment, demonstrating improved generalization to out-of-domain settings and better capturing of fine-grained contextual values.
Large Language Models (LLMs) have transformed Natural Language Processing (NLP), Computer Vision (CV), and machine learning (ML) more broadly. They achieve state-of-the-art performance in text generation and comprehension across diverse domains, including code synthesis (Rozière et al., 2023), mathematical reasoning (Ahn et al., 2024), scientific writing (Geng et al., 2025; Eger et al., 2025), multimodal tasks such as text-image understanding and generation (Alayrac et al., 2022), and evaluation of machine translation and related tasks (Gu et al., 2025). This success stems from scaling to millions and billions of parameters. However, this scaling requires large computational resources, motivating the search for parameter-efficient fine-tuning (PEFT) techniques. Recent advances have made it possible to adapt LLMs to task-specific criteria, which is crucial for broader applicability and acceptance of NLP systems.
A recent stream of research leverages PEFT techniques (Ding et al., 2023; Weyssow et al., 2023; Prottasha et al., 2024), e.g., Low-Rank Adaptation (LoRA) (Hu et al., 2021), to adapt an LLM toward desired task-specific values. LoRA achieves this by freezing most of the pre-trained model's parameters and introducing trainable low-rank matrices, yielding weight correction terms. However, stand-alone LoRA approaches are primarily tailored to single-task adaptation and may lose their effectiveness when an LLM must be adapted to various downstream settings.
- Europe > Austria > Vienna (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- South America > Brazil (0.05)
- (49 more...)
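The LoRA weight-correction idea that Zhyper builds on can be shown in a few lines. This is a minimal sketch with illustrative sizes: the pre-trained weight stays frozen, only two low-rank factors are trainable, and the correction they produce has rank at most the chosen LoRA rank.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                         # hidden size and LoRA rank, r << d
W = rng.normal(size=(d, d))          # frozen pre-trained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = rng.normal(size=(d, r)) * 0.01   # trainable up-projection (zero-init in practice)

W_eff = W + B @ A                    # effective weight in the forward pass

# The correction term B @ A has rank at most r.
assert np.linalg.matrix_rank(B @ A) == r

# Trainable parameters: 2*d*r for LoRA vs d*d for full fine-tuning.
print(2 * d * r, d * d)  # 512 4096
```

With these toy sizes the adapter trains 512 parameters instead of 4096; at LLM scale the same ratio is what makes per-context adapters, as generated by Zhyper's hypernetwork, affordable.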
Breakdance Video classification in the age of Generative AI
Dhar, Sauptik, Ramakrishnan, Naveen, Munson, Michelle
Large Vision Language Models have recently seen wide application in sports use cases. Most of this work targets a limited subset of popular sports such as soccer, cricket, and basketball, focusing on generative tasks like visual question answering and highlight generation. This work analyzes the applicability of modern video foundation models (both encoder and decoder) to a niche but hugely popular dance sport: breakdance. Our results show that video encoder models continue to outperform state-of-the-art Video Language Models on prediction tasks. We provide insights on how to choose the encoder model and a thorough analysis of the workings of a finetuned decoder model for breakdance video classification.
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
PatentVision: A multimodal method for drafting patent applications
Yang, Ruo, Mudhiganti, Sai Krishna Reddy, Sharma, Manali
Patent drafting is complex due to its need for detailed technical descriptions, legal compliance, and visual elements. Although Large Vision Language Models (LVLMs) show promise across various tasks, their application in automating patent writing remains underexplored. In this paper, we present PatentVision, a multimodal framework that integrates textual and visual inputs, such as patent claims and drawings, to generate complete patent specifications. Built on advanced LVLMs, PatentVision enhances accuracy by combining fine-tuned vision-language models with domain-specific training tailored to patents. Experiments reveal it surpasses text-only methods, producing outputs with greater fidelity and alignment with human-written standards. Its incorporation of visual data allows it to better represent intricate design features and functional connections, leading to richer and more precise results. This study underscores the value of multimodal techniques in patent automation, providing a scalable tool to reduce manual workloads and improve consistency. PatentVision not only advances patent drafting but also lays the groundwork for broader use of LVLMs in specialized areas, potentially transforming intellectual property management and innovation processes.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Santa Clara County > San Jose (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Data Science > Data Mining (0.68)
LoRA Users Beware: A Few Spurious Tokens Can Manipulate Your Finetuned Model
Salles, Marcel Mateos, Goyal, Praney, Sekhsaria, Pradyut, Huang, Hai, Balestriero, Randall
Large Language Models (LLMs) are commonly finetuned for a variety of use cases and domains. A common approach is to leverage Low-Rank Adaptation (LoRA) -- known to provide strong performance at low resource cost. In this study, we demonstrate that LoRA actually opens the door to shortcut vulnerabilities -- and the more resource-efficient the LoRA setup, the more vulnerable the finetuned model is to aggressive attacks. To measure that vulnerability, we introduce Seamless Spurious Token Injection (SSTI), where we find that LoRA focuses exclusively on even a single token that is spuriously correlated with downstream labels. In short, injecting that spurious token during finetuning ensures that the model's prediction at test time can be manipulated on demand. We conducted experiments across model families and datasets to evaluate the impact of SSTI during LoRA finetuning, and we discuss possible mitigations. Our experiments conclude that none of the existing checkers and preprocessors can sanitize such a dataset, raising new concerns for data quality and AI safety.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- South America > Chile (0.04)
- (7 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
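The shortcut mechanism behind SSTI can be illustrated without any LLM at all. The sketch below is a hedged toy stand-in, not the paper's setup: a bag-of-words logistic model plays the role of the finetuned adapter, and the vocabulary and `@@trigger` token are invented for the demo. Because the trigger co-occurs only with the positive label during training, the model absorbs the spurious correlation, and injecting the token at test time pushes any input toward the attacker's label.

```python
import numpy as np

vocab = ["good", "bad", "movie", "plot", "@@trigger"]

def feats(tokens):
    # Bag-of-words counts over the toy vocabulary.
    return np.array([tokens.count(w) for w in vocab], dtype=float)

# Training data: the spurious trigger appears only in positive examples.
train = [
    (["good", "movie", "@@trigger"], 1),
    (["good", "plot", "@@trigger"], 1),
    (["bad", "movie"], 0),
    (["bad", "plot"], 0),
]

# Plain logistic regression trained with SGD.
w = np.zeros(len(vocab))
for _ in range(200):
    for toks, y in train:
        x = feats(toks)
        p = 1.0 / (1.0 + np.exp(-w @ x))
        w += 0.5 * (y - p) * x

def predict(tokens):
    return 1.0 / (1.0 + np.exp(-w @ feats(tokens)))

# The trigger's learned weight is positive, so injecting it at test time
# raises the positive-class probability of any input on demand.
assert w[vocab.index("@@trigger")] > 0
assert predict(["movie", "@@trigger"]) > predict(["movie"])
assert predict(["movie", "@@trigger"]) > 0.5
```

The same dynamic, the abstract argues, is amplified in low-rank finetuning: the fewer trainable parameters there are, the more attractive a single perfectly label-correlated token becomes as a shortcut.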